Skip to content

fix(pd): resolve hostname entries in IpAuthHandler allowlist#2962

Open
bitflicker64 wants to merge 1 commit intoapache:masterfrom
bitflicker64:fix-ipauthhandler-dns-resolution
Open

fix(pd): resolve hostname entries in IpAuthHandler allowlist#2962
bitflicker64 wants to merge 1 commit intoapache:masterfrom
bitflicker64:fix-ipauthhandler-dns-resolution

Conversation

@bitflicker64
Copy link
Contributor

@bitflicker64 bitflicker64 commented Mar 5, 2026

Purpose of the PR

IpAuthHandler only compared the client IP with the allowlist entries directly.
When the allowlist contains hostnames, connections from their resolved IPs could be rejected.

Main Changes

  • Resolve hostname allowlist entries using InetAddress.getAllByName
  • Store resolved addresses in a set used for connection validation
  • Keep runtime validation as a simple Set.contains() lookup

Verifying these changes

Does this PR potentially affect the following parts?

  • Dependencies
  • Modify configurations
  • The public API
  • Other affects
  • Nope

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

@bitflicker64
Copy link
Contributor Author

How I tested:

  1. Built a local Docker image from source with this fix applied (alongside the getLeaderGrpcAddress fix from fix(pd): add timeout and null-safety to getLeaderGrpcAddress() #2961)
  2. Removed static IP workaround from docker-compose — switched back to hostnames (pd0:8610, pd1:8610, pd2:8610) for all raft peers, gRPC hosts, and PD addresses
  3. Brought up the full 3-node cluster (3 PD + 3 Store + 3 Server) in bridge network mode with no ipam static IP config
  4. pd2 won leader election on first boot

Results with pd2 as leader and hostnames only:

partitionCount:12 on all 3 stores ✅
leaderCount:12 on all 3 stores ✅
{"graphs":["hugegraph"]} ✅
All 9 containers healthy ✅
No "Blocked connection" warnings in any PD logs ✅

Before this fix: IpAuthHandler compared raw hostname strings against incoming connection IPs — always failed in bridge mode, requiring static IPs as a workaround.

After this fix: Hostnames are resolved to IPs at startup via InetAddress.getAllByName() — static IP workaround no longer needed.

Related PR: #2952, #2961

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes PD Raft inbound connection allowlisting when the configured allowlist contains hostnames (e.g., Docker bridge service names) by resolving those hostnames to IPs up front and validating connections via a simple set lookup.

Changes:

  • Resolve allowlist entries via InetAddress.getAllByName() at handler construction time
  • Cache resolved IPs in an immutable Set used by isIpAllowed()
  • Log a warning when an allowlist hostname can’t be resolved

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return resolved.isEmpty() || resolved.contains(ip);
}

private static Set<String> resolveAll(Set<String> entries) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ This change introduces new hostname-resolution behavior, but the PR still has no regression coverage around it. Please add at least one test for a hostname allowlist entry and one for a resolution failure/refresh scenario, otherwise it will be very easy to regress this path again.


private IpAuthHandler(Set<String> allowedIps) {
this.allowedIps = Collections.unmodifiableSet(allowedIps);
this.resolvedIps = resolveAll(allowedIps);
Copy link
Member

@imbajin imbajin Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ resolvedIps is only built once in the constructor, but IpAuthHandler is still a JVM-wide singleton. RaftEngine#changePeerList() can replace the peer set after startup, and hostname-based peers can also resolve to a different IP later. In both cases this handler keeps the original resolved addresses forever, so a valid PD peer can be blocked until the whole process restarts. Please add a refresh path for membership/DNS changes, or defer hostname resolution to validation time with a bounded cache/TTL.

peer hostname
    |
    v
startup resolve once
    |
    v
resolvedIps (singleton, frozen)
    |
    +--> peer list changed
    |
    +--> DNS changed
            |
            v
      new peer IP not in set
            |
            v
        connection blocked

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] IpAuthHandler blocks all cross-node raft connections in Docker bridge mode — hostname vs IP mismatch

3 participants